Job Title: Site Reliability Manager
Hybrid/Onsite: Dallas, TX
COG Assets
Top Qualifications
Job Summary
We are seeking a Site Reliability Manager with 8 to 12 years of experience to join our team. The ideal candidate will have expertise in database design, MySQL, Node.js, Kubernetes, iPaaS, Dynatrace, Azure CI, Moogsoft, GitHub, MongoDB, and PostgreSQL. This role involves managing geospatial data projects, ensuring data integrity, and leveraging advanced technologies to drive business outcomes.
Required Skills: MySQL, Node.js, Kubernetes, iPaaS, Dynatrace, Azure CI, Moogsoft, GitHub, MongoDB, PostgreSQL
Roles & Responsibilities
Configure monitoring and alerting to notify on symptoms rather than on outages.
Document findings so they turn into repeatable actions, and then into automation.
Improve the deployment, change management, and release management processes to make them efficient and streamlined.
Debug production issues across services and levels of the stack.
Propose ideas and solutions within the product team to improve resiliency, availability, and security.
Plan and execute configuration change operations both at the application and the infrastructure level.
Actively look for opportunities to improve the availability and performance of the system by applying lessons learned from monitoring and observation.
Complete Root Cause Analysis (RCA) investigations.
Improve DevSecOps practices, accelerate delivery, and take a lead role in troubleshooting technical issues.
Assist in providing inputs to develop strategic technology roadmaps.
Respond to incidents and provide support for customer incidents.
Must have
- Implement GitHub, GitHub Actions CI/CD, and Azure DevOps (ADO) cloud for automation
- Implement monitoring and observability in AKS, Azure cloud, and Kubernetes
- Monitoring and metrics in Dynatrace, Prometheus, and Grafana, with integrations with Moogsoft or xMatters
- Open-source logging infrastructure
- 2 years of experience working in an environment with Node.js and GraphQL (GQL)
- Hands-on experience with Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) tools and platforms, and with containers and container orchestration platforms (e.g., Docker and Kubernetes)
- Expertise in one or more cloud-native relational databases such as MySQL and PostgreSQL, and NoSQL databases such as Cassandra and MongoDB, highly desired
- Strong technical knowledge and skills that are broad and deep, covering various hardware, software, and technology platforms
- Develop, implement, and maintain applications and systems that integrate MongoDB
- Dynatrace
- Mezmo
- Security Vulnerabilities (remediation or compliance)
Good to have
- Terraform in Azure and on-prem infrastructure resources
- Automated load balancing for the application, including proxies and CDN
- Able to script automated performance testing scenarios for APIs and web front ends and embed them in CI/CD pipelines; dashboarding/reporting query languages
- Airline Industry experience helpful
- TypeScript, JavaScript
- Database and persistence frameworks: MongoDB, Oracle, object/relational mapping, query performance tuning
- Experience with MongoDB schema design and the MongoDB aggregation framework
- Web services: GraphQL, REST/SOAP (JSON/WSDL/XML)
- DB Admin/SQL Server, Terraform, SysAdmin, Troubleshooting Network Issues, VM Management